Frequently asked questions

Gilles Quénot edited this page Apr 30, 2023 · 7 revisions

Getting the last node

Q: How do I get the last node? //foo//bar returns all bars, but I only want the last one, and //foo//bar[last()] did not work.

<div>
  <foo>
    <bar>First </bar>
    <bar>Second </bar>
  </foo>
  <foo>
    <bar>Third </bar>
    <bar>Fourth </bar>
  </foo>
</div>

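A: //foo//bar[last()] returns the last bar of each parent, in the example Second and Fourth. The predicate belongs to the final step of the path, so last() is evaluated separately inside each parent element (here, each foo).

To get the last bar of the whole result, put the path in parentheses: (//foo//bar)[last()].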
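To see the difference, the sample document from the question can be saved to a file and queried with both expressions. This is a minimal sketch: the file name bars.xml is an assumption, and -s is used only to silence status messages.

```shell
# Save the sample document from the question (bars.xml is a hypothetical file name).
cat > bars.xml <<'EOF'
<div>
  <foo>
    <bar>First </bar>
    <bar>Second </bar>
  </foo>
  <foo>
    <bar>Third </bar>
    <bar>Fourth </bar>
  </foo>
</div>
EOF

# [last()] is evaluated per parent: this selects the last bar inside each foo.
xidel -s bars.xml -e '//foo//bar[last()]'

# The parentheses build the full sequence of bars first,
# then [last()] picks the final item of that sequence.
xidel -s bars.xml -e '(//foo//bar)[last()]'
```

The first command should list two nodes (Second and Fourth), the second only one (Fourth).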

Getting nodes with an attribute

Q: I want to extract the title attribute from links whose href contains the string "contentFile.aspx".

This command returns the href, but I do not know how to get the Title contents instead.

xidel 'http://www.coorong.sa.gov.au/page.aspx?u=1813' --xquery '//a/@href[contains(., "contentFile.aspx")]'

A: You can go back up from the @href attribute to its parent a element:

xidel 'http://www.coorong.sa.gov.au/page.aspx?u=1813' --xquery '//a/@href[contains(., "contentFile.aspx")]/../@title'

Or you can put the condition on the a element:

xidel 'http://www.coorong.sa.gov.au/page.aspx?u=1813' --xquery '//a[@href[contains(., "contentFile.aspx")]]/@title'

or

xidel 'http://www.coorong.sa.gov.au/page.aspx?u=1813' --xquery '//a[contains(@href, "contentFile.aspx")]/@title'
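To try these patterns without a network connection, the following sketch builds a small local page and runs two of the forms against it. The file name links.html and its contents are made up for illustration and are not taken from the site above.

```shell
# links.html is a hypothetical test page with one matching and one non-matching link.
cat > links.html <<'EOF'
<html><body>
  <a href="contentFile.aspx?id=1" title="Annual report">Report</a>
  <a href="page.aspx?u=2" title="Contact page">Contact</a>
</body></html>
EOF

# Filter on the a element, then step down to its title attribute.
xidel -s links.html -e '//a[contains(@href, "contentFile.aspx")]/@title'

# Same selection: filter on the href attribute, then go back up with "..".
xidel -s links.html -e '//a/@href[contains(., "contentFile.aspx")]/../@title'
```

Both commands should print the title of the first link only.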

Getting nodes containing text

Q: How do you find tags which include a certain text?

A: You can use contains or matches on these nodes. E.g.

xidel input.html -e '//*[contains(., "searched text")]'

This finds every node containing the text as well as all of its ancestors, because the string value of a node includes the text of all its descendants.

To find the nodes without ancestors, you can check only the direct text of the nodes:

xidel input.html -e '//*[text()[contains(., "searched text")]]'

This is also much faster. However, text that spans multiple nodes is not found: in <span>foo<b>bar</b></span>, either foo or bar can be found with text(), but not foobar.
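The limitation can be reproduced with a one-element document. This is a sketch under the assumption that the snippet is stored in a file named span.xml (a hypothetical name); name() and count() are standard XPath functions used here only to make the results easy to read.

```shell
# span.xml is a hypothetical file holding the snippet from the text above.
cat > span.xml <<'EOF'
<?xml version="1.0"?>
<span>foo<b>bar</b></span>
EOF

# The string value of span is "foobar", so the whole-node test matches it.
xidel -s span.xml -e '//*[contains(., "foobar")]/name()'

# No single text node holds "foobar", so the text() test counts zero matches.
xidel -s span.xml -e 'count(//*[text()[contains(., "foobar")]])'

# "bar" sits in a single text node, the direct text of b.
xidel -s span.xml -e '//*[text()[contains(., "bar")]]/name()'
```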

When "searched text" is a regular expression, you can use matches in place of contains.
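As an illustration of matches, the sketch below selects only the paragraphs whose direct text contains "Order" followed by a number. The file orders.html and its contents are invented for this example.

```shell
# orders.html is a hypothetical test page.
cat > orders.html <<'EOF'
<html><body>
  <p>Order 66 shipped</p>
  <p>Order pending</p>
  <p>Order 1234 pending</p>
</body></html>
EOF

# matches() takes a regular expression: "Order", a space, then one or more digits.
# The paragraph without a number is skipped.
xidel -s orders.html -e '//p[text()[matches(., "Order [0-9]+")]]'
```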

Replacing empty/null nodes

Q: How do I return a default value if the input is empty?

A: For inputs that have at most one value, use:

(input, "default value")[1]

[1] returns the first value of a sequence, so it will return input if input exists. If input is empty, the sequence becomes ("default value")[1], so it will return "default value".
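A small sketch of this idiom on piped input follows. The element names item, name and price are hypothetical; the document has a name but no price, so the first query returns the real value and the second falls back to the default.

```shell
# //name exists, so it is the first item of the sequence and gets returned.
printf '<item><name>Widget</name></item>' |
  xidel -s - -e '(//name, "default value")[1]'

# //price is empty, so the sequence collapses to ("default value").
printf '<item><name>Widget</name></item>' |
  xidel -s - -e '(//price, "default value")[1]'
```

If the input could hold several values, [1] would keep only the first of them, which is why the idiom is meant for inputs with at most one value.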

Deletion of nodes

Q: How do I delete the div from

<div>
    <span>I want to keep this</span>
    <div class="I_want_to_delete_this">
        <span>blah blah</span>
    </div>
    <span>I want to keep this too</span>
</div>

to get something like

<div>
    <span>I want to keep this</span>
    <span>I want to keep this too</span>
</div>

?

A: All data is immutable, so you cannot delete something from a document, but you can create a new document without these nodes.

For example, using the x:replace-nodes function:

xidel --xml -e 'x:replace-nodes(//div[@class="I_want_to_delete_this"],())' xx.xml

Or with the x:transform-nodes function:

xidel -s input.xml -e '
  x:transform-nodes(
    /,
    function($x){
      if (name($x)="div" and $x[@class="I_want_to_delete_this"])
      then ()
      else $x
    }
  )
' --output-node-format=xml --output-node-indent

or

xidel -s input.xml -e '
  let $delete:=//div[@class="I_want_to_delete_this"] return
  x:transform-nodes(
    /,
    function($x){if ($delete[$x is .]) then () else $x}
  )
' --output-node-format=xml --output-node-indent

or

xidel -s input.xml -e '
  let $delete:=//div[@class="I_want_to_delete_this"] return
  x:transform-nodes(
    /,
    function($x){$x[not($delete[$x is .])]}
  )
' --output-node-format=xml --output-node-indent

Using Xidel in a shell pipeline | xidel

Q: Is there any way of processing output from another script in xidel, i.e. is there any option to tell xidel to grab the content like this: grep foobar test.html | xidel ...

A: If you pass a dash (-) as the file name, xidel reads its input from the pipe (standard input).

grep foobar test.html | xidel - ...
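A complete version of such a pipeline might look like the sketch below. The contents of test.html and the //li query are assumptions made for the example; -s only suppresses status messages.

```shell
# test.html is a hypothetical input; only its second line mentions foobar.
cat > test.html <<'EOF'
<p>unrelated line</p>
<ul><li>foobar one</li><li>foobar two</li></ul>
<p>another unrelated line</p>
EOF

# grep keeps the matching line; "-" makes xidel parse what arrives on stdin.
grep foobar test.html | xidel -s - -e '//li'
```

Because grep works line by line, the fragment it passes on is only useful if the markup of interest sits on the matching lines.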

Caveats

Also look here for things to avoid: https://github.com/benibela/xidel/wiki/Caveats